Abstract

Aim: Negative appendectomy is costly for both the patient and the health system; furthermore, it exposes the patient to an unnecessary surgical intervention
and its possible complications. This study aims to identify the most suitable appendicitis scoring system by comparing nine of the most popular ones.
Material and Methods: The study included 170 patients who were histopathologically diagnosed with appendicitis in the last year (Group 1), and 143 patients
whose pathology showed no appendicitis in the last five years (Group 2). The variables required to calculate the scoring systems for the prediction of acute
appendicitis were recorded in the study datasheet, and each patient’s score for each scoring system was calculated automatically using a formula-based Excel file.

Results: Among all scoring systems, the Karaman score had the highest positive predictive value (89.9%), with a negative predictive value of 57.9%.
The Alvarado score performed best overall among the scoring systems, with a positive predictive value of 89.5%, a negative predictive value of 85.7%,
a sensitivity of 67.6%, and a specificity of 84.1%.

Discussion: The use of suitable scoring systems, with or without imaging modalities, may reduce negative appendectomy rates.
Introduction

Acute appendicitis (AA) is the most common diagnosis requiring
emergency surgery worldwide, with a lifetime risk of 8.6% in
males and 6.7% in females [1,2]. Despite advances in laboratory
and imaging methods, negative appendectomy rates range
from 3% to 25% in various publications [3-5]. The diagnosis
of acute appendicitis is typically reached by combining the
patient’s history, physical examination findings, and laboratory
parameters with imaging modalities when necessary.
Unfortunately, a patient seen in the emergency unit
may put the surgeon in a difficult position when the physical
examination and laboratory findings do not fully
correspond to acute appendicitis.

In challenging cases, the use of ultrasonography (USG) and
computed tomography (CT) supports the diagnosis. In the
literature, the overall sensitivity and specificity are 76%
and 95%, respectively, for USG, and 99% and 84%, respectively, for CT.
However, these imaging methods may not be readily accessible
and can lead to high costs.

A negative appendectomy (NA) is costly for both the patient
and the health system [6]; furthermore, it exposes the
patient to an unnecessary surgical intervention and its possible
complications. To date, several clinical scoring systems (CSS) have
been defined to assist the diagnosis of AA [6-14]. These clinical
scoring systems generally aim to clarify the diagnosis of AA
based on scaled parameters from patients’ examination findings
and laboratory values. In the diagnosis of AA, the most valuable
CSS should be the one that reduces the
rate of negative appendectomy while still correctly diagnosing
true appendicitis. Therefore, this study aims to determine which
of nine popular scoring systems yields the lowest rate of
negative appendectomy and to identify an effective scoring system
for diagnosing true appendicitis. Furthermore, the current study
is the first in the literature to compare nine different CSSs.


Material and Methods 

The current retrospective comparative study included 170
patients whose pathology confirmed acute appendicitis
in the last year at our institute, and 143 patients whose
pathology did not reveal acute appendicitis among 1314
appendectomies performed in the previous five years. The ethics
committee of our university approved the study with ethics number
E-71522473-050.01.04-47683-410. Patients <18 years of age,
pregnant patients, patients with existing malignancy, patients
using steroids for any reason, immunosuppressed patients, and
COVID-19-positive patients were excluded from the study.
Each patient was examined by a consultant surgeon within
the first hour after admission to the emergency department.
Following the laboratory analyses, abdominal
USG was performed on all patients. In cases of doubt
about the diagnosis, an abdominal CT was performed. The absence
of appendicitis in the pathology report was considered a negative
appendectomy. The histological diagnosis of appendicitis
was based on infiltration of the muscularis propria
by neutrophil granulocytes. The variables required to calculate
the scoring systems (Alvarado, Raja Isteri Pengiran Anak Saleha
Appendicitis (RIPASA), Tzanakis, Appendicitis Inflammatory
Response (AIR), Eskelinen, Ohmann, Lintula, Fenyo-Lindberg,
and Karaman scoring systems) for the prediction of acute
appendicitis were recorded in the study datasheet, and each
patient’s score for each scoring system was calculated
automatically using a formula-based Excel file. The patients were
divided into two groups: Group 1, the positive appendicitis group,
and Group 2, the negative appendicitis group. Appendectomies
were performed using conventional or laparoscopic methods.
Statistical analysis was performed to compare the performance of
the scoring systems between the groups.

Statistical analysis 

Descriptive analyses were performed to provide information
on the general characteristics of the study population. The
Kolmogorov-Smirnov test was used to evaluate whether the
distributions of numerical variables were normal. Accordingly,
the independent-samples t-test and the Kruskal-Wallis test
were used to compare numerical variables between groups.
Numerical variables were presented as mean ± standard
deviation. Categorical variables were compared using the
Chi-square test and were presented as count
and percentage. A p-value <0.05 was considered significant.
Receiver operating characteristic (ROC) curve analysis was used
to identify the best cut-off value and to assess the performance of
each score for appendicitis. Analyses were performed using
SPSS statistical software (IBM SPSS Statistics, Version 23.0.
Armonk, NY: IBM Corp.).
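As an illustration of how a "best" cut-off can be read from a ROC analysis, the sketch below picks the threshold that maximizes Youden's J (sensitivity + specificity - 1). This criterion is an assumption: the study does not state which rule was used to select the cut-off. The function name `best_cutoff` and the toy data are hypothetical and are not study data.

```python
def best_cutoff(scores, has_appendicitis):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A patient is predicted positive when score >= cutoff.
    Assumption: Youden's J is the selection rule (not stated in the study).
    """
    positives = [s for s, y in zip(scores, has_appendicitis) if y]
    negatives = [s for s, y in zip(scores, has_appendicitis) if not y]
    best = None
    for cutoff in sorted(set(scores)):
        sensitivity = sum(s >= cutoff for s in positives) / len(positives)
        specificity = sum(s < cutoff for s in negatives) / len(negatives)
        j = sensitivity + specificity - 1
        # Keep the first (lowest) cutoff on ties.
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1:]


# Hypothetical scores for illustration only (not study data).
toy_scores = [2, 3, 4, 5, 6, 7, 8, 9]
toy_labels = [False, False, False, True, False, True, True, True]
cutoff, sens, spec = best_cutoff(toy_scores, toy_labels)
print(cutoff, sens, spec)
```

Other selection rules (for example, fixing a minimum sensitivity) would generally give a different cut-off for the same data.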


Results 

No difference in age distribution was found between the two
groups included in the study. However, among the
laboratory parameters, white blood cell count (WBC), neutrophil
value, and PMNL ratio were significantly higher in Group 1
(Table 1). Total bilirubin, on the other hand, was
higher in Group 2. The C-reactive protein (CRP) value was higher
in Group 1 but, contrary to many studies, the difference
from Group 2 was not statistically significant (Table 1).
Among patients whose USG indicated appendicitis, 82.9% had
histopathologically confirmed acute appendicitis. Male patients
predominated in Group 1 and female patients in Group 2, and this
difference was statistically significant.
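The gender comparison can be checked by hand from the counts in Table 1 (male 110 vs 54, female 60 vs 89). The sketch below computes a Pearson Chi-square statistic for a 2x2 table without continuity correction; whether the original SPSS analysis applied a correction is not stated, so this is an assumption, and the function name is hypothetical.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (df = 1) for [[a, b], [c, d]].

    No continuity correction is applied (assumption).
    """
    n = a + b + c + d
    observed = [a, b, c, d]
    row_totals = [a + b, a + b, c + d, c + d]
    col_totals = [a + c, b + d, a + c, b + d]
    chi2 = 0.0
    for obs, row, col in zip(observed, row_totals, col_totals):
        expected = row * col / n
        chi2 += (obs - expected) ** 2 / expected
    # For df = 1 the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: male, female; columns: appendicitis, not appendicitis (Table 1).
chi2, p_value = pearson_chi_square_2x2(110, 54, 60, 89)
print(f"chi2 = {chi2:.2f}, p < 0.05: {p_value < 0.05}")
```

The resulting p-value is well below 0.05, which agrees with the significance reported for gender in Table 1.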
Figure 1. ROC curves for diagnostic performance of appendicitis
scoring systems 
Abdominal pain was rated as severe in 31.2% of Group 1
and as mild in 43% of Group 2.
Increasing pain was reported significantly more often in Group 2
(79.6%) than in Group 1 (55.3%). In addition, a symptom duration of


Table 1. Distribution of features in the appendicitis and
non-appendicitis groups.

Variable                          Category    Appendicitis          Not appendicitis      p
Patients, n (%)                               170 (54.3%)           143 (45.7%)
Age                                           33 (17-83)            33 (18-79)            0.60
Neutrophil                                    10.45 ± 4.58          5.57 ± 2.82           <0.05
MPV                                           8.17 ± 1.47           7.77 ± 1.23           0.27
Total bilirubin                               0.96 ± 0.64           1.26 ± 8.26           <0.05
CRP                                           60.14 ± 81.67         42.25 ± 42.61         0.84
WBC                                           14730.9 ± 11449.84    11842 ± 4491.67       <0.05
PMNL ratio                                    75.91 ± 13.9          63.64 ± 12.67         <0.05
Negative urine analysis           No          13 (18.3%)            58 (81.7%)            [illegible]
                                  Yes         157 (64.9%)           85 (35.1%)
App. in USG                       No          15 (12%)              110 (88%)             <0.05
                                  Yes         155 (82.9%)           33 (17.1%)
Gender                            Male        110 (64.7%)           54 (37.3%)            <0.05
                                  Female      60 (35.3%)            89 (62.7%)
Fever                             No          161 (56.1%)           126 (43.9%)           0.035
                                  Yes         9 (34.6%)             17 (65.4%)
Right lower quadrant pain         Yes         164 (96.5%)           116 (81.0%)           [illegible]
                                  No          6 (3.5%)              27 (19.0%)
Severity of pain                  Mild        30 (17.6%)            62 (43.0%)            <0.05
                                  Moderate    87 (51.2%)            69 (48.6%)
                                  High        53 (31.2%)            12 (8.5%)
Pain outside the RLQ              No          126 (74.1%)           88 (61.3%)            0.41
                                  Yes         44 (25.9%)            55 (38.7%)
Increasing pain                   No          76 (44.7%)            30 (20.4%)            <0.05
                                  Yes         94 (55.3%)            113 (79.6%)
Pain from umbilicus to RLQ        No          80 (47.1%)            69 (47.9%)            [illegible]
                                  Yes         90 (52.9%)            74 (52.1%)
Vomiting                          No          109 (64.1%)           89 (62.7%)            0.72
                                  Yes         61 (35.9%)            54 (37.3%)
Loss of appetite                  No          58 (34.1%)            59 (40.8%)            0.20
                                  Yes         112 (65.9%)           84 (59.2%)
Duration of symptoms              <24h        89 (52.4%)            67 (47.2%)            <0.05
                                  24-48h      38 (22.4%)            61 (42.3%)
                                  >48h        43 (25.3%)            15 (10.6%)
RLQ pain with coughing            No          56 (32.9%)            34 (23.9%)            [illegible]
                                  Yes         114 (67.1%)           109 (76.1%)
Bowel sounds with auscultation    None        6 (3.5%)              0 (0%)                <0.05
                                  Normal      161 (94.7%)           126 (88.0%)
                                  Hyperactive 3 (1.8%)              17 (12.0%)
Rigidity                          None        35 (20.6%)            46 (32.4%)            <0.05
                                  Mild        20 (11.8%)            39 (26.8%)
                                  Moderate    74 (43.5%)            43 (30.3%)
                                  High        41 (24.1%)            15 (10.6%)
Tenderness                        No          4 (2.4%)              27 (19.9%)            <0.05
                                  Yes         166 (97.6%)           116 (81.0%)
Rebound                           No          45 (28.3%)            85 (60.0%)            <0.05
                                  Yes         114 (71.7%)           56 (40.0%)
Rovsing                           No          101 (59.4%)           76 (52.8%)            0.34
                                  Yes         69 (40.6%)            67 (47.2%)

MPV mean platelet volume, CRP C-reactive protein, PMNL polymorphonuclear leukocyte,
App appendicitis, RLQ right lower quadrant. [illegible]: value not recoverable from the source.


Table 2. Distribution of appendicitis diagnostic performance
criteria of scoring systems

Score             AUC          Cut-off      PPV          NPV          Specificity  Sensitivity  p
Karaman           0.765        4.5          0.899        0.579        0.739        0.676        <0.05
Alvarado          [illegible]  [illegible]  0.895        0.857        0.841        0.676        <0.05
RIPASA            0.645        [illegible]  0.642        0.947        0.572        0.607        <0.05
Tzanakis          [illegible]  6.5          [illegible]  [illegible]  0.536        0.717        <0.05
AIR               0.749        5.5          0.898        0.800        0.688        0.676        <0.05
Eskelinen         0.788        61.47        0.599        0.871        0.667        0.676        <0.05
Ohmann            0.794        12.25        0.658        0.893        0.703        0.740        <0.05
Lintula           [illegible]  [illegible]  [illegible]  [illegible]  0.623        0.618        <0.05
Fenyo-Lindberg    0.719        4.5          0.654        0.627        0.659        0.665        <0.05

AUC: Area under curve, PPV: Positive predictive value, NPV: Negative predictive value.
[illegible]: value not recoverable from the source.


24-48 hours was significantly more frequent in Group 2. Hyperactive
bowel sounds and absence of rebound were significantly more
frequent in Group 2, whereas moderate abdominal rigidity and
tenderness were significantly less frequent (Table 1).

When the nine clinical scoring systems included in the study
were compared, the Ohmann score had the highest sensitivity,
at 74% (Table 2). When the area under the curve
(AUC) was evaluated, the Alvarado score ranked first
and the Ohmann score second (Figure 1).
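For readers less familiar with the criteria in Table 2, the following sketch shows how sensitivity, specificity, PPV, and NPV follow from a 2x2 table. As a worked example it uses the "App. in USG" counts from Table 1, treating USG as the test and histopathology as the reference. This is only an illustration and not part of the study's own analysis of the scoring systems; the function name is hypothetical. Values are recomputed from the raw counts, so they can differ slightly from the rounded percentages printed in the table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among diseased
        "specificity": tn / (tn + fp),  # true negatives among non-diseased
        "ppv": tp / (tp + fp),          # diseased among test-positives
        "npv": tn / (tn + fn),          # non-diseased among test-negatives
    }


# Counts from the "App. in USG" rows of Table 1:
# USG positive: 155 (appendicitis), 33 (not appendicitis)
# USG negative: 15 (appendicitis), 110 (not appendicitis)
usg = diagnostic_metrics(tp=155, fp=33, fn=15, tn=110)
for name, value in usg.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on the proportion of appendicitis cases in the sample, so they should be interpreted in light of how the two groups were assembled.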


Discussion 

Despite advanced imaging methods, negative appendectomy
rates have not decreased to the desired level in the diagnosis
of appendicitis. Variable examination and laboratory findings,
as well as female gender, are among the main reasons that
make the diagnosis of appendicitis challenging. Unfortunately,
in their professional practice many surgeons face the dilemma
between avoiding an unnecessary appendectomy and risking
perforation due to delayed diagnosis. In the current
study, our negative appendectomy rate was 10.88%, consistent
with the literature.
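The reported rate follows directly from the figures given in Material and Methods (143 negative appendectomies among 1314 appendectomies in five years). The short check below reproduces it; the variable names are illustrative.

```python
negative_appendectomies = 143  # Group 2 in this study
total_appendectomies = 1314    # all appendectomies in the five-year period

rate_percent = 100 * negative_appendectomies / total_appendectomies
print(f"Negative appendectomy rate: {rate_percent:.2f}%")
```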

The use of suitable scoring systems, with or without imaging
modalities, may reduce negative appendectomy rates.
Negative appendectomy can result in
significant morbidity, such as prolonged hospital
stay and postoperative surgical site infections, and even mortality
[15]. On the other hand, diagnostic laparoscopy performed in
suspicious cases also carries the risks of general anesthesia,
and many surgeons eventually perform an appendectomy. The
issue is that postoperative complications are similar with or
without inflammation of the appendix [16]. In this context, the
described clinical scoring systems may effectively support the
diagnosis of appendicitis and reduce negative appendectomy
rates in areas where imaging methods are difficult to access.
Previous studies have shown that scoring systems reduce negative
appendectomy rates [10,17-21]. The Alvarado score is the first
of these scoring systems; it reaches a diagnosis from a score
consisting of symptoms, signs, and laboratory parameters. In
the scoring systems defined later, different variables were
combined with similar parameters. Examples include gender in
Fenyo-Lindberg, urinary symptoms in the RIPASA and Ohmann
scores, and the use of USG in the Tzanakis score. In the current
study, the Karaman score had the highest PPV and the
RIPASA score had the highest NPV. Moreover, the Alvarado
score was the most specific, while the Ohmann score was the
most sensitive (Figure 1). Based on AUC analysis, the Alvarado
and Ohmann scores produced the highest AUC values among
the nine scoring systems, while the RIPASA and Lintula scores had
the lowest AUC values (Figure 1). Among the scoring systems
with low AUC values, the inclusion of gender among the evaluation
parameters draws attention. The weak effect of this variable may be
due to the presence or weighting of this parameter.
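The AUC values compared above have a direct interpretation: the AUC equals the probability that a randomly chosen patient with appendicitis receives a higher score than a randomly chosen patient without it, with ties counted as one half. The sketch below computes the AUC this way for hypothetical toy scores (not study data); the function name is illustrative.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as 0.5.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical scores for illustration only (not study data).
toy_auc = auc_from_scores([5, 7, 8, 9], [2, 3, 4, 6])
print(toy_auc)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.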

Female gender, particularly at a young age, increases negative
appendectomy rates because numerous gynecological pathologies
enter the differential diagnosis [22]. In addition,
scoring systems are also affected by age and geographic
population [23]. Unsurprisingly, in our study, the majority of
Group 2 was female.

Leukocytosis is the most critical laboratory value for the
diagnosis of acute appendicitis, and therefore for the scoring
systems. Many studies have stated that biochemical markers
(C-reactive protein, bilirubin) may be helpful in the diagnosis of
acute appendicitis. The fact that the results for these markers
differ between studies limits their routine use in diagnosing acute
appendicitis; as expected, these values are not among the
parameters in any scoring system. Our analysis has once again
confirmed the influence of leukocytosis in acute appendicitis.

Although imaging modalities such as USG and CT are available
to aid diagnosis and differential diagnosis, their utility is
limited by accessibility, exposure to ionizing radiation, and
cost. As evaluated in a previous study, negative
appendectomy rates could not be reduced with the contribution
of USG, but could be decreased by at least 8.3% with CT [24].
For this reason, it would be fair to state that
tomographic evaluation assists in the differential diagnosis
as well as in the diagnosis of appendicitis. In our study, USG was
performed in all patients who were evaluated with a preliminary
diagnosis of acute appendicitis. Patients still suspected of having
appendicitis despite examination, laboratory data, and ultrasound
underwent a CT scan. Despite this, unfortunately, our negative
appendectomy rate was 10.88%.

From this point of view, combining CSS with ultrasound or
tomography may reduce negative appendectomy rates. In
the study conducted by Althoubaity et al., the higher rate of
negative appendectomy (22%) with the use of USG alone in
female patients suggested that combining USG with scoring systems
is more important in females [24]. Likewise, another recently
published study reported that combining the Ohmann, Lintula, and
RIPASA scoring systems with ultrasound could reduce the
negative appendectomy rate up to 4% in female patients [25].
In addition, another study pointed out that, besides gender, age
group and geographic population also affect scoring systems
[23].

However, in hospitals where access to imaging methods is
limited, or in cases where the cost cannot be met,
CSS will significantly contribute to the diagnosis of appendicitis.
In centers with no restrictions such as cost or accessibility in
the use of imaging methods, the combination of USG and/or
CT with CSS may reduce negative appendectomy rates. To
achieve the appropriate outcome, a CSS should be chosen
on the basis of studies that minimize the effects of gender,
age, and other factors and that can reduce negative appendectomy
rates. In the current study, the Alvarado and Ohmann scores
were the most effective scoring systems for preventing negative
appendectomy.

Study limitations 

The limitations of our study include its retrospective
design and the fact that the clinical scoring systems were compared
not among patients who presented to the emergency department with
abdominal pain but between groups who had undergone appendectomy.

In conclusion, the use of an appropriate CSS, either where access
to imaging modalities is difficult or in addition to these
modalities in suitable centers, may reduce negative appendectomy
rates in the diagnosis of appendicitis. According to our analysis,
the Alvarado and Ohmann scores are the most relevant
tests for this purpose.